QUESTIONS FOR THE COMMITTEE


What are your views on the usefulness of the approach taken, that is, to
investigate a number of modelling options, ranging from simple to complex, for
answering policy relevant questions and for checking the robustness of the
results?


What other discrete choice modelling approaches are suitable for assessing
dynamics in longitudinal (or panel) data?


This is the first time ABS has included Average Partial Effects (APEs) in its 
research publications. In this study, the APEs are attractive in that they provide a 
more interpretable effect measure, which does not depend on the unobserved 
firm-specific effects, than simply reporting the probit coefficients. What are the 
views of the MAC members regarding their inclusion in other ABS studies? 


In future studies the ABS might be interested in extending these analyses by 
including survey weights. Given the discrete nature of the dependent variable, 
the complexity of the models, and the panel nature of the data, how should this 
be done? In particular, what adjustments are required in terms of modelling, 
standard errors adjustment, and bootstrapping? Also should the adjustments be 
made on a per year basis given that the longitudinal data span a few years? 


CONTENTS 


ABSTRACT
1. INTRODUCTION
2. CONCEPTUAL BACKGROUND AND HYPOTHESES
    2.1 Innovation and its persistence
    2.2 Innovation and flexible working arrangements
    2.3 Innovation and information technology
    2.4 Innovation and collaboration
3. MODELS
4. MODEL APPLICATION
    4.1 [title illegible in source]
    4.2 Overall innovation results
    4.3 Different types of innovation results
5. CONCLUDING REMARKS
ACKNOWLEDGEMENTS
REFERENCES
APPENDIXES
A. DATA COMPILATION
B. AVERAGE PARTIAL EFFECTS (APEs)
C. MODEL MEASURES


The role of the Methodology Advisory Committee (MAC) is to review and direct research into the 
collection, estimation, dissemination and analytical methodologies associated with ABS statistics. 


ABSTRACT


This study explores a number of discrete choice panel data models to analyse the 
effects of three factors on innovation: flexible working arrangements, information 
technology, and collaboration, while controlling for the effects of other important 
variables, such as industry, business size, competition, and market location. The study 
also examines whether firms that innovated in the past are more likely to innovate in 
subsequent periods. 


The econometric models examined range from the relatively simple pooled model to 
the more complex dynamic probit model. The aim is to assess both the static and 
dynamic relationships and to take care of the categorical dependent variable and the 
longitudinal nature of the dataset. 


The analyses are undertaken in the context of Australian small and medium-sized
firms. Three waves of the ABS Business Longitudinal Database (BLD) are used: 
2007-2008, 2008-2009, and 2009-2010. 


1. INTRODUCTION 


In recent times, there has been a considerable increase in the number of longitudinal 
(or panel) data analyses. One main reason is the greater availability of such data. 


The advancement of statistical methodologies and greater computing power have 
enabled the ‘creation’ of rich longitudinal datasets, by combining and linking different 
datasets, where, for example, employees are linked to employers, survey data are 
combined with administrative datasets, and different waves of the same survey are 
linked over time. Longitudinal data can now also be constructed from existing
administrative data collections, not only from repeatedly collected survey data. As this
trend towards bigger, richer, and more dynamic data is expected to continue, it is
important for statistical agencies to equip themselves with the analytical
skills and the statistical and computing tools needed to process and analyse these
datasets.


Longitudinal data are attractive in that the data are typically very rich and by blending 
both time and cross-sectional characteristics, the data provide a lot of opportunities 
for analysis that are often not available with simple cross-sectional or time-series data. 
Longitudinal data allow for changes over time to be analysed at the unit, rather than
aggregate, level. This greatly enhances the data available for individual-level 
modelling. Important advantages include the increased precision and efficiency in 
estimation, the possibility of controlling for the impact of omitted variables, the 
analysis of more complex behaviours, and the possibility of incorporating dynamic 
effects in the model. (See Cameron and Trivedi, 2005 and Hsiao, 2007 for more 
details.) 


However, despite these advantages, extracting valid and meaningful inferences from 
longitudinal data can prove very challenging. This is especially the case when dealing 
with non-linear models, which is usually the case if the dependent variable is discrete. 
In these instances, the models are usually complex, the statistical inferences 
sophisticated, and the well-established approaches of dealing with the incidental 
parameters (which are typically implemented in linear models) do not always work. 
Rather, the analyst often needs to deal with non-additive heterogeneity, make 
assumptions about the interactions between the observed and unobserved covariates, 
address difficult asymptotic theory, and implement modern techniques such as 
bootstrapping to make valid statistical inferences. 


Focus of this paper 


It is in this context that this study explores the implementation, estimation, and
performance of different discrete choice longitudinal data models using the Main Unit
Record File (MURF) of the ABS Business Longitudinal Database (BLD).¹ The empirical
analysis is based on modelling the behaviour of innovation using firm-level data 
comprised of Australian small and medium-sized enterprises (SMEs). Three waves of 
the BLD are used: 2007-2008, 2008-2009, and 2009-2010. 


A range of econometric models is examined, with the aim of assessing both the static
and dynamic relationships while accounting for the discrete dependent variable and the
longitudinal nature of the dataset. The models include the following:


• the pooled model;
• the standard random effects model;
• the correlated random effects model (using the specifications of Mundlak, 1978
  and Chamberlain, 1984);
• a standard dynamic model; and
• a dynamic random effects probit model that follows Wooldridge (2005) to deal
  with the initial conditions problem.


1 For more information on the BLD, see ABS (2010). 


ABS • DISCRETE CHOICE PANEL DATA MODELLING USING THE ABS BUSINESS LONGITUDINAL DATABASE • 1352.0.55.139


ABS METHODOLOGY ADVISORY COMMITTEE • NOVEMBER 2013


As both relatively simple as well as complex models are implemented, it is of interest 
to see how the results differ across models and whether the findings of previous ABS 
studies that examined the same relationships using cross-sectional data, collected by 
the same survey, are supported by the models run on the longitudinal data. 


The main focus is on analysing the effects of three key factors on innovation: flexible 
working arrangements, information technology, and collaboration, while controlling 
for the effects of other important variables, such as industry, business size, 
competition, and market location. The study also examines whether firms that
innovated in the past are more likely to innovate in subsequent periods, i.e. whether
there is ‘persistence in innovation’ (Clausen et al., 2012).


The study uses an output-based measure of innovation and follows the OECD 
approach in differentiating between four types of innovation: new goods and services, 
new organisational processes, new operational processes, and new marketing 
methods. To the best of the author’s knowledge, this is the first study that examines 
the persistence of innovation by distinguishing between the four types of innovation 
using Australian firm-level survey data. 


Apart from the main relationships, the different models are intended to assess the 
following: (1) the lag effects and the importance of the initial conditions; (2) the 
importance of controlling for the unobserved firm-specific effects; and (3) the 
independence assumption between the regressors and firm heterogeneity. 
Incorporating and testing these model aspects is important because it gives an
indication of whether omitting them (as is done in standard cross-sectional
analyses) has any bearing on the results.


From the perspective of the ABS, this study is important for a number of reasons, 
including capability building in longitudinal analysis, the exploration of the 
longitudinal aspect of the BLD (this is the first longitudinal study conducted on the
BLD MURF), and the extension of the previous ABS cross-sectional analyses to the
longitudinal front.² Given the categorical nature of some of the data items collected
by the ABS and the potential of more panel data work in the future, these methods 
have the potential of being used in other ABS outputs. 


The paper is structured as follows. Section 2 presents a short conceptual background 
to the empirical application. In Section 3, the theory of the models implemented in 
the analysis is described. Section 4 applies the models and describes the results. 
Section 5 concludes. 


2 In particular the extension is for the following ABS studies: Todhunter and Abello (2011), Soames et al. (2011),
Soriano and Yong (2011), Rotaru et al. (2013), and Tiy et al. (2013).


2. CONCEPTUAL BACKGROUND AND HYPOTHESES


In a global economy, innovation plays a key role in the success and competitiveness of 
businesses and corporations. Its benefits are far-reaching and they can extend from 
the innovating firms to the welfare of society and to the whole economy (DIISRTE,
2012). Innovation is often regarded as a pivotal engine of output and productivity 
growth, employment performance, and international competitiveness (Stokey, 1995; 
Aghion et al., 2013; OECD/Eurostat, 2005) and for some as “the cornerstone of 
economic growth” (White House, 2011). 


Following the writings of Joseph Schumpeter, a large body of research has been
undertaken on examining, both empirically and theoretically, the innovation process 
and the factors that affect innovation. These include information technology, 
collaboration, competition, labour market flexibility, productivity, as well as others. 


One area of interest, which has received limited empirical attention, is the
relationship between flexible working arrangements and innovation. As emphasised
in a recent report of the White House Council of Economic Advisers (White House,
2010), the limited empirical research on the effects of these arrangements is mainly
due to a lack of appropriate data. Other studies have also highlighted the need for 
more research in this area, for example Martinez-Sanchez et al. (2008) and Zhou et al. 
(2011). One important question is whether, and to what extent, a flexible working 
environment influences innovative thinking and the likelihood of a firm to innovate. 


Information technology and collaboration are two other factors that play key roles in 
influencing innovation. Similar ABS studies using the ABS Business Characteristics
Survey (BCS)³ found a positive and significant relationship between innovation and
information and communication technology (Todhunter and Abello, 2011; Rotaru et al.,
2013; and Tiy et al., 2013) and between innovation and collaboration (ABS, 2008 and
DITR, 2006). One limitation, however, is that all these earlier studies were
cross-sectional in nature and therefore were not able to capture the dynamic behaviour of
firms. As found in Martinez-Ros and Labeaga (2002), the significance of some
determinants of innovation can vanish after controlling for the dynamic effects.


In what follows, this section describes the key variables and relationships examined in 
this study and summarises the hypotheses to be tested. Given that some of these
analyses were conducted previously by the ABS (although only at a cross-sectional
level), readers are directed to the specific ABS studies for a more thorough
coverage.


3 For more information on the BCS see Selected Characteristics of Australian Business, 2009-2010 (ABS, 2011b). 


Four influences on innovation are analysed in this paper, namely:

1. Innovation persistence;
2. Flexible working arrangements;
3. Information technology; and
4. Collaboration.


2.1 Innovation and its persistence 


Innovation has many dimensions and is a complex phenomenon to define and analyse 
(OECD/Eurostat, 2005). Amongst others, innovation can differ with respect to the 
degree of novelty (new to the firm, new to the industry, new to the country, or new to 
the world), type of innovation (new goods or services, new operational processes, 
new organisational processes, or new marketing methods), and degree of 
implementation (successfully implemented, ongoing, or abandoned). Part of its 
complexity stems from its multidimensional nature, continuous process (i.e. 
innovation keeps on upgrading), dynamic and non-linear behaviour, and complex 
diffusion process. (See Fagerberg et al., 2010; DIISRTE, 2012; OECD/Eurostat, 2005
for more details.) 


In defining innovation, this study uses the internationally recognised definition given 
in the Oslo Manual, where innovation is defined as: 


“... the implementation of a new or significantly improved product (good or service), or
process, a new marketing method, or a new organisational method in business practices,
workplace organisation or external relations.” (OECD/Eurostat, 2005, p. 46)


Due to its importance for the success of firms and for the whole economy there has 
been considerable research done on different fronts of this complex phenomenon. 
One such front is on understanding the dynamic behaviour of innovation. This area of 
research aims to assess whether the firms that innovated in the past are more likely to 
innovate in subsequent periods, or as it is termed in some studies, whether innovation 
is persistent or path-dependent. 


There are different theories to explain why this might be the case. Clausen et al.
(2012) list three main lines of reasoning identified in previous research that support
the dynamic behaviour of innovation. The first relates to the notion that “success 
breeds success”. The idea is that as firms successfully innovate, they also become 
more profitable, as innovation generates more profits, which in turn leads to further 
innovation. The second line of reasoning relates to the idea of learning-by-doing, 
where firms learn from the innovation process to be more effective in dealing with 
issues and in solving problems. The theory also rests on the idea that knowledge is 
cumulative and as such, by innovating, a firm expands its knowledge base, which then 
can be used to innovate more successfully in the future. The final theory claims that
innovation is persistent because of the absorptive nature of R&D. The idea is that
because knowledge is absorbed by human capital, it is unlikely that a firm invests
in R&D and then pursues only a one-off innovation.


A number of studies have found strong evidence of such persistence, for example
Martinez-Ros and Labeaga (2002) and Clausen et al. (2012). Note that due to the relatively
short panel, the focus here is only on the one-period lag effects. The first hypothesis to be
tested is thus:


H1: Firms that innovated in period $t-1$ are more likely to innovate in period $t$.


Note that as different types of innovation may have different behaviours, the study
distinguishes between the four types of innovation mentioned above. However, the
study does not report separate results by degree of novelty or degree of
implementation.


2.2 Innovation and flexible working arrangements 


To gain and maintain a competitive edge in a global and increasingly
competitive environment, firms often need to adapt quickly and adequately to
fast-paced labour market changes. Labour market flexibility plays an important role in
addressing this challenge (European Commission, 2007; OECD, 1994). 


Although there are different views regarding what constitutes labour market flexibility, 
one widely-used categorisation of the concept is given by Atkinson (1984) and 
Atkinson and Meager (1986). These studies categorise labour market flexibility into
four types:

• external numerical (i.e. flexibility in adjusting the labour intake from the external
  market),
• internal numerical (i.e. flexibility in adjusting the working hours or schedules of
  employees, such as flexible working hours or shifts, part-time, or leave
  flexibility),
• functional (i.e. flexibility in transferring employees to different tasks and
  activities), and
• financial or wage flexibility (i.e. flexibility in deciding wage levels).


In addition to these, there are other types of flexibility, one of which is the location 
flexibility or flexibility of place (e.g. home-based work) (Reilly, 2001; Wallace, 2003). 


Most of the studies that examined labour market flexibility have concentrated on the 
strategies employed by businesses to deal with and adapt to the changes in the labour 
market. The idea is that by quickly adapting to the fluctuations in the labour market, 
businesses are more competitive and more likely to innovate. (See Martinez-Sanchez
et al., 2008; Zhou et al., 2011 and Chung, 2009 for a theoretical discussion and a literature
review.)


On the other hand, flexibility can also refer to a business's strategies to meet the
needs of its employees. The idea is that by creating an appropriate environment,
where key capabilities are promoted and nurtured, employees are more committed 
and more likely to innovate. (See Storey et al., 2002 and Chung, 2009 for more 
details.) 


Although the two views seem to be mutually exclusive, that need not be the case. As 
stated in a recent European Commission report (European Commission, 2007), labour 
market flexibility can be used by businesses to address both the business need for 
adapting to the fast changes in the market, as well as the employees’ needs of working 
in a productive and secure environment. 


This study looks at four types of flexible working arrangements: flexible working 
hours, flexible leave, job sharing, and working from home. To the best of the author’s 
knowledge this is the first study to examine the relationship between flexible working 
arrangements and innovation using Australian longitudinal business survey data at the 
micro-firm level. 


The hypothesis is thus: 


H2: Firms that have flexible working arrangements are more likely to innovate. 


2.3 Innovation and information technology 


Information technology is often regarded as a major driver of innovation (White 
House, 2011; Atkinson and Andes, 2009). This is particularly true in the current 
environment where businesses are faced with fast-paced technological advancements 
and tough global competition. Although the role of ICT as a source of business
innovation is not explained in this report, interested readers are directed to the
study of Todhunter and Ruel (2011).


This study examines whether having intensive information technology systems, 
summarised by an information technology index, improves the likelihood to innovate. 
It is of interest to observe whether the positive link between innovation and 
information technology found by previous studies using the ABS BCS (i.e. Todhunter
and Abello, 2011; Rotaru et al., 2013; Tiy et al., 2013) is maintained when using
longitudinal data. The hypothesis to be tested is thus: 


H3: Firms that have higher information technology intensity are more likely to
innovate.


2.4 Innovation and collaboration


The relationship between collaboration and innovation is another aspect that this 
study looks at. As has been pointed out in numerous studies, collaboration plays a key
role in affecting innovation (see ABS, 2008 and DITR, 2006 for more details). The 
hypothesis to be tested is thus: 


H4: Firms that collaborate are more likely to innovate.


3. MODELS


This section highlights the theoretical underpinnings of the models used in the analysis.
In this general setting let subscript $i$ index the firm (referring to the cross-sectional
aspect of the data) and $t$ the time period. For firm $i$ at time $t$ let $x_{it}$ be the vector
of (observed) explanatory variables, $y_{it}$ the (observed) dichotomous outcome
variable, and $\varepsilon_{it}$ the residual term. The general model (assuming a balanced panel)
can be represented as


$P(y_{it} = 1 \mid x_{it}) = F(x_{it}'\beta)$, where $i = 1, 2, \ldots, N$ and $t = 1, \ldots, T$, (3.1)


and $F(\cdot)$ is an appropriate cumulative distribution function for a dichotomous response
variable, which in this study is the standard normal distribution function (i.e. a probit model), and


$y_{it} = 1(x_{it}'\beta + \varepsilon_{it} > 0)$, (3.2)


$1(\cdot)$ denoting the 0-1 indicator function.
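
As a minimal sketch of equations (3.1) and (3.2), the following Python snippet evaluates the probit response probability and the latent-index rule for a single observation. The function names, coefficient values and covariate values are hypothetical and chosen purely for illustration; they are not taken from the BLD.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, so F(.) is the probit link


def probit_probability(x, beta):
    """P(y = 1 | x) = F(x'beta), as in equation (3.1)."""
    index = sum(xk * bk for xk, bk in zip(x, beta))
    return STD_NORMAL.cdf(index)


def latent_outcome(x, beta, eps):
    """y = 1(x'beta + eps > 0), as in equation (3.2)."""
    index = sum(xk * bk for xk, bk in zip(x, beta))
    return 1 if index + eps > 0 else 0


# Hypothetical values: an intercept and two covariates.
beta = [-0.5, 0.8, 0.3]
x = [1.0, 1.0, 0.0]
print(round(probit_probability(x, beta), 4))
print(latent_outcome(x, beta, eps=-1.0), latent_outcome(x, beta, eps=0.0))
```

The two functions describe the same model: averaging `latent_outcome` over draws of a standard normal `eps` would approximate `probit_probability`.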


Note that the model described by equation (3.1) only specifies the marginal
distribution of $y_{it}$ and therefore leaves the joint distribution $P(y_{i1}, \ldots, y_{iT})$
unspecified. There are two important reasons why $P(y_{it}, y_{i,t-1}) \neq P(y_{it}) P(y_{i,t-1})$.
The first is the unobserved firm-specific effects (i.e. unobserved
heterogeneity), which have the potential to affect the outcome of interest. The
second is the lag effects of the dependent variable, i.e. true state dependence.
(See Amemiya, 1985, chapter 9 for more details.)


Both types of dependence can be observed in equation (3.3), where $c_i$ captures the
unobserved heterogeneity corresponding to firm $i$ and the coefficient $\rho$ indicates whether
there is state dependence or not.


$P(y_{it} = 1 \mid y_{i,t-1}, \ldots, y_{i0}, x_i, c_i) = F(\rho y_{i,t-1} + x_{it}'\beta + c_i)$ (3.3)


A few considerations are worth noting about the specification of model (3.3). First,
the dynamics have a relatively simple structure, representing a first-order Markov
process in that only the first lag is included. Second, the heterogeneity enters the model
additively. Third, only $x_{it}$ appears in the model, although
$x_i = (x_{i1}, \ldots, x_{iT})$ is in the conditioning set.


In what follows, this section briefly presents the five models that were used in the
analysis. The models are classified according to the way they deal with the two types
of dependence mentioned above and the specification of the joint probability.


Model 1: The Pooled Model


One simple way of specifying the joint probability is by assuming some form of 
independence, in the sense that 


$P(y_{it} \mid y_{i,t-1}, \ldots, y_{i0}, x_i) = P(y_{it} \mid x_{it})$ (3.4)


In this case the joint probability follows directly from the marginal probability by 
simply multiplying the individual marginal probabilities. Mathematically this can be 
written as 


$P(y_{i1}, \ldots, y_{iT} \mid x_i) = \prod_{t} P(y_{it} \mid x_{it})$

Although the model does not directly deal with either the unobserved firm specific 
effects or the state dependence of the outcome variable, one can make adjustments 
for these by computing panel-robust standard errors. The model is attractive in that it 
is simple to implement and interpret and because robust standard errors can be 
obtained without imposing specific functional forms. The model has been widely 
used and provides a good reference for the other models considered. 
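
To illustrate how assumption (3.4) turns the joint probability into a product of marginal probabilities, the sketch below computes a pooled probit log-likelihood as a plain sum over all firm-period observations. The data layout (`panel` as a dictionary of firm histories), the function name and all numbers are assumptions made for this example. Panel-robust standard errors are not computed here.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pooled_probit_loglik(beta, panel):
    """Log-likelihood of the pooled probit model.

    `panel` maps a firm id to a list of (y, x) pairs, one per period.
    Under assumption (3.4), the joint probability for a firm is the
    product of the period-by-period marginal probabilities, so the
    log-likelihood is a sum over all firm-period observations.
    """
    total = 0.0
    for observations in panel.values():
        for y, x in observations:
            p = STD_NORMAL.cdf(sum(xk * bk for xk, bk in zip(x, beta)))
            total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


# Hypothetical two-firm, three-period panel (intercept plus one covariate).
panel = {
    "firm_a": [(1, [1.0, 0.2]), (1, [1.0, 0.5]), (0, [1.0, -0.1])],
    "firm_b": [(0, [1.0, -0.4]), (0, [1.0, 0.0]), (1, [1.0, 0.9])],
}
print(pooled_probit_loglik([0.0, 1.0], panel))
```

Maximising this function over `beta` with any numerical optimiser would give the pooled estimates; the firm identifier plays no role in the point estimates and matters only for the clustering of standard errors.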


Model 2: The Standard Random Effects (RE) Model 


A different approach, which has become very popular, is to incorporate the 
unobserved heterogeneity directly in the model. The model assumes that 


$P(y_{it} \mid y_{i,t-1}, \ldots, y_{i0}, x_i, c_i) = P(y_{it} \mid x_{it}, c_i)$. (3.5)


From (3.5), it immediately follows that 


$P(y_{i1}, \ldots, y_{iT} \mid x_i, c_i) = \prod_{t} P(y_{it} \mid x_{it}, c_i)$.


One problem with the conditional probability above is that although it is conditioned
on $c_i$, one does not observe the firm-specific effects. This raises the important
consideration of how to treat the unobserved effects. Two important aspects that
need to be considered include the distribution of the unobserved effects and the way
the unobserved heterogeneity enters into the joint probability, i.e. the relationship
with the other variables in $(y_i, x_i)$.


With regard to its relationship with $y_{it}$, most applications consider an additive
functional form, where the relationship between $y_{it}$ and $c_i$ is linear and additive,
similar to the relationship described by equation (3.3). With regard
to $x_{it}$, the traditional random effects model assumes independence between $c_i$ and
$x_{it}$. An alternative is to assume some form of dependence in the form of a specified
relationship, as shown in the next model.


Finally, for the distribution, most applications assume a normal distribution. In
notational form the conditional distribution of $c_i$ is given by


$c_i \mid x_i \sim N(0, \sigma_c^2)$.


To obtain the joint probability $P(y_{i1}, \ldots, y_{iT} \mid x_i)$ one can integrate out the
firm-specific effects.
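
One common way to integrate out the firm-specific effects numerically is Gauss-Hermite quadrature. The sketch below is an illustration under that assumption and is not necessarily the method used in this study. It approximates the likelihood contribution of one firm, $\int \prod_t P(y_{it} \mid x_{it}, c)\, \phi(c/\sigma_c)/\sigma_c \, dc$, using the substitution $c = \sqrt{2}\,\sigma_c v$. The function name, the number of nodes and the data are hypothetical.

```python
import math
from statistics import NormalDist

import numpy as np

STD_NORMAL = NormalDist()


def re_probit_firm_likelihood(beta, sigma_c, observations, n_nodes=20):
    """Likelihood contribution of one firm in a random effects probit.

    Integrates prod_t P(y_it | x_it, c) over c ~ N(0, sigma_c^2) using
    Gauss-Hermite quadrature with the substitution c = sqrt(2)*sigma_c*v.
    `observations` is a list of (y, x) pairs for the firm.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    total = 0.0
    for v, w in zip(nodes, weights):
        c = math.sqrt(2.0) * sigma_c * v
        prod = 1.0
        for y, x in observations:
            index = sum(xk * bk for xk, bk in zip(x, beta)) + c
            # (2y - 1) flips the sign so the CDF gives P(y_it | x_it, c)
            prod *= STD_NORMAL.cdf((2 * y - 1) * index)
        total += w * prod
    return total / math.sqrt(math.pi)


# Hypothetical firm observed for three periods (intercept plus one covariate).
obs = [(1, [1.0, 0.2]), (1, [1.0, 0.5]), (0, [1.0, -0.1])]
print(re_probit_firm_likelihood([0.0, 1.0], sigma_c=0.8, observations=obs))
```

Setting `sigma_c` to zero collapses the integral to the pooled product of marginal probabilities, which is a convenient sanity check when implementing the model.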


This model, although widely used in practice, rests on a strong independence
assumption between the regressors and the unobserved heterogeneity. It also does 
not include lag effects. The model is nonetheless attractive in that it controls for the 
unobserved firm specific effects in a simple and intuitive way, treating the unobserved 
effects as a random variable with a definite distribution. 


Model 3: The Mundlak/Chamberlain Random Effects Model 


The previous model is restrictive in that it does not allow for any correlation between
$c_i$ and $x_{it}$. To relax this assumption, one can follow the approaches of Mundlak
(1978) and Chamberlain (1984) and allow for a specific type of dependence between
firm-specific effects and the explanatory variables, namely one that is linear with a
normally distributed error. In notation, the relationship can be
expressed as


$c_i = c + z_i'\gamma + u_i$, where $c_i \mid z_i \sim N(c + z_i'\gamma, \sigma_u^2)$,


and where $z_i$ refers to the group/cluster means of the time-varying variables in $x_{it}$
(i.e. $\bar{x}_i$) in the case of the Mundlak (1978) specification, or to the vector of time-varying $x_{it}$ across
all time periods, i.e. $z_i = x_i := (x_{i1}, \ldots, x_{iT})$, in the case of the Chamberlain (1984)
specification, and $\sigma_u^2$ is the conditional variance of $c_i \mid x_i$. It follows that


$P(y_{it} = 1 \mid y_{i,t-1}, \ldots, y_{i0}, x_i, c_i) = P(y_{it} = 1 \mid x_{it}, c_i) = F(x_{it}'\beta + c + z_i'\gamma + u_i)$ (3.6)


The estimation of the model is very similar to that of the traditional random effects
probit model. This follows immediately since $x_{it}$ and $z_i$ are observed and since it is
assumed that $u_i \mid z_i \sim N(0, \sigma_u^2)$.


Although somewhat restrictive in the sense that one needs to specify the dependence
between the unobserved effects and the regressors, the model is attractive in that it 
does not impose the strong independence assumption of the standard RE model. 
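
In practice, the Mundlak (1978) version of this model amounts to adding firm-level means of the time-varying covariates as extra regressors before fitting a standard random effects probit. The following sketch shows that data-preparation step only. The row layout, the function name and the variable `it_index` are hypothetical.

```python
from collections import defaultdict


def add_mundlak_means(rows, time_varying):
    """Append firm-level means of time-varying covariates to each row.

    `rows` is a list of dicts with a "firm" key; the new keys are named
    "<variable>_mean" and play the role of z_i in the Mundlak (1978)
    specification. The input rows are left unchanged.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        counts[row["firm"]] += 1
        for var in time_varying:
            sums[row["firm"]][var] += row[var]

    augmented = []
    for row in rows:
        new_row = dict(row)
        for var in time_varying:
            new_row[var + "_mean"] = sums[row["firm"]][var] / counts[row["firm"]]
        augmented.append(new_row)
    return augmented


# Hypothetical panel: an IT-intensity index observed over three waves.
rows = [
    {"firm": 1, "wave": 2008, "it_index": 2.0},
    {"firm": 1, "wave": 2009, "it_index": 3.0},
    {"firm": 1, "wave": 2010, "it_index": 4.0},
    {"firm": 2, "wave": 2008, "it_index": 1.0},
    {"firm": 2, "wave": 2009, "it_index": 1.0},
    {"firm": 2, "wave": 2010, "it_index": 4.0},
]
augmented = add_mundlak_means(rows, ["it_index"])
print(augmented[0]["it_index_mean"], augmented[3]["it_index_mean"])
```

The Chamberlain (1984) variant would instead attach the full vector of period-specific values to every row of the firm, at the cost of more parameters.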


Model 4: The Standard Dynamic Model


The previous models deal with the dependence that comes from the firm specific 
effects and do not make direct allowance for the lag terms (although, some 
adjustments are usually made in computing standard errors that are robust to serial 
correlation). In some cases, modelling this dynamic relationship is important for
analysis. This model and the next aim to address this.


In terms of the general model described by equation (3.3), the standard dynamic
model assumes that


$P(y_{it} = 1 \mid y_{i,t-1}, \ldots, y_{i0}, x_i) = F(\rho y_{i,t-1} + x_{it}'\beta)$


Similar to model 1, some corrections are usually required in the form of computing 
panel-robust standard errors. Once again, given its simplicity, the model is a good 
reference for other more complex dynamic models. 


Model 5: The Wooldridge Dynamic Model 


This model aims to model both types of dependence, i.e., that coming from the 
unobserved firm specific effects as well as the persistence of the outcomes. In 
general, the model is described by (3.3), which is also included below: 


$P(y_{it} = 1 \mid y_{i,t-1}, \ldots, y_{i0}, x_i, c_i) = F(\rho y_{i,t-1} + x_{it}'\beta + c_i)$


One additional challenge in this case is that the unobserved firm effects, captured by
$c_i$, are likely to be correlated with $y_{i,t-1}$. With a relatively short panel the initial
conditions, $y_{i0}$, are likely to play an important role and ignoring this correlation
might not be sensible. One relatively simple solution is the approach suggested by
Wooldridge (2005), where the idea is to model the conditional joint distribution


$P(y_{i1}, y_{i2}, \ldots, y_{iT} \mid y_{i0}, x_i, c_i)$
rather than $P(y_{i0}, y_{i1}, y_{i2}, \ldots, y_{iT} \mid x_i, c_i)$.

Similar to the Mundlak/Chamberlain specification, the approach specifies $c_i$ in the
same way, with the difference that it now also includes the initial conditions. This can
be expressed as


$c_i \mid z_i, y_{i0} = \psi + z_i'\gamma + \xi_0 y_{i0} + a_i$


where $a_i \sim N(0, \sigma_a^2)$ and where $a_i$ is independent of $y_{i0}$ and $z_i$.


It immediately follows that


$y_{it} = 1(\psi + \xi_0 y_{i0} + z_i'\gamma + \rho y_{i,t-1} + x_{it}'\beta + a_i + \varepsilon_{it} > 0)$ (3.7)


where $\varepsilon_{it} \sim N(0,1)$.


As $y_{it}$ conditional on $(y_{i,t-1}, \ldots, y_{i0}, x_i, a_i)$ follows a probit model and as the $a_i$ are
normally distributed, the model is similar to the previous Chamberlain/Mundlak
Random Effects Model. The difference here is that the conditioning set also includes
$y_{i0}$ and that the lag effects are also included.
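
The practical consequence is that the Wooldridge (2005) model can be fitted with standard random effects probit software once the lagged outcome, the initial condition and the $z_i$ terms are added as regressors. The sketch below builds such rows for a single firm with one scalar covariate. The function name and the data are hypothetical, and taking $z_i$ to be the mean of the covariate over the estimation periods is an assumption made here for brevity.

```python
def wooldridge_rows(y, x):
    """Build estimation rows for one firm in the Wooldridge (2005) setup.

    `y` holds outcomes y_0, ..., y_T and `x` a scalar covariate for the
    same periods. Rows are produced for t = 1, ..., T and contain the
    lagged outcome, the initial condition y_0 and the mean of x over the
    estimation periods (a Mundlak-style choice of z_i, assumed here).
    """
    if len(y) != len(x) or len(y) < 2:
        raise ValueError("need at least two aligned periods")
    x_mean = sum(x[1:]) / (len(x) - 1)
    return [
        {"y": y[t], "y_lag": y[t - 1], "y_init": y[0], "x": x[t], "x_mean": x_mean}
        for t in range(1, len(y))
    ]


# Hypothetical firm observed in three waves: innovation status and an IT index.
firm_rows = wooldridge_rows(y=[0, 1, 1], x=[1.0, 2.0, 4.0])
for row in firm_rows:
    print(row)
```

With three waves, as in this study, each firm contributes two estimation rows, because the first wave is used only to supply the initial condition and the first lag.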


The table below briefly summarises the five models discussed. Note that the summary
is not exhaustive.

3.1 Summary of the five models⁴

Model 1
  Treatment of unobserved heterogeneity ($a_i$): Ignored; panel-robust standard errors are computed instead.
  Inclusion of lag effects: Not included.
  Allowance for correlation between $a_i$ and covariates: Not applicable.
  Disadvantage: The estimated coefficients can be inconsistent if the true model has individual-specific random effects. Also the estimators can be inefficient.
  Advantage: The model is relatively simple to use and implement. No need for distributional assumptions of the firm-specific effects.

Model 2
  Treatment of unobserved heterogeneity ($a_i$): Treated as a random variable with a specified distribution.
  Inclusion of lag effects: Not included.
  Allowance for correlation between $a_i$ and covariates: Assumes independence.
  Disadvantage: The estimated coefficients can be inconsistent if the individual-specific effects are correlated with regressors. The model requires distributional assumptions for firm-specific effects.
  Advantage: The model is relatively simple and it makes direct allowance for individual-specific effects.

Model 3
  Treatment of unobserved heterogeneity ($a_i$): Treated as a random variable with a specified distribution.
  Inclusion of lag effects: Not included.
  Allowance for correlation between $a_i$ and covariates: Allows for correlation between $a_i$ and the covariates.
  Disadvantage: The estimation and implementation of the model are more complex. The model requires distributional assumptions for firm-specific effects.
  Advantage: Similar to model 2. The model also allows for correlation between $a_i$ and the regressors.

Model 4
  Treatment of unobserved heterogeneity ($a_i$): Ignored; panel-robust standard errors are computed instead.
  Inclusion of lag effects: Included.
  Allowance for correlation between $a_i$ and covariates: Not applicable.
  Disadvantage: Same as model 1.
  Advantage: Similar to model 1 but it also includes lag effects.

Model 5
  Treatment of unobserved heterogeneity ($a_i$): Treated as a random variable with a specified distribution.
  Inclusion of lag effects: Included.
  Allowance for correlation between $a_i$ and covariates: Similar to model 3 but it also includes the correlation between $a_i$ and the initial conditions.
  Disadvantage: The estimation and implementation of the model are much more complex. The model requires distributional assumptions for firm-specific effects.
  Advantage: Similar to model 3 but it also includes lag effects.

Complexity (implementation and interpretation)*: [rankings missing in source]
* = Relative complexity across the five models, with a ranking of 1 standing for the least complex model
and 5 for the most complex.


4 See Baltagi (2008), Greene (2012), Wooldridge (2010), and Cameron and Trivedi (2005) for more
technical details on the models.


4. MODEL APPLICATION


This section has three parts. The first discusses the methodological findings of the
five models implemented. These are based on the results for overall innovation (i.e.
any type of innovation) and the average partial effects (APEs).⁵ The second part
focuses on the empirical application and discusses the effects of the key variables on
innovation. The results are compared to those of other ABS studies that used
cross-sectional data collected by the same survey. The final part takes a closer look at the
relationships by disaggregating innovation into four categories: new goods and
services, new operational processes, new organisational processes, and new marketing
methods. For brevity, only the Wooldridge Dynamic Probit model results are
reported in this final part.


In all models the reference firm belongs to the Manufacturing industry, is very small,
does not have any type of flexible working arrangements, faces no competition, has
the most intense ICT, operates only locally, and does not collaborate. The reference year
is 2007-2008. In choosing the variables to be included in the models, the study
followed the similar cross-sectional ABS studies mentioned above. Interested
readers are referred to these studies for more details.


4.1 Methodological findings 


Building on the theory presented in Section 3, five models were adopted. Three are
non-dynamic: the Pooled model (model 1), the standard Random Effects (RE)
model (model 2), and the Chamberlain/Mundlak RE model (model 3). Two are dynamic:
the Standard Dynamic Probit model (model 4) and the Wooldridge Dynamic Probit model
(model 5).


To control for the presence of heterogeneity and/or serial correlation, panel-robust
standard errors were computed for all models, either by clustering (when this option
was available in the software package) or by bootstrapping for panel data. For models
3 and 5 both the Chamberlain and the Mundlak specifications were investigated;
apart from some minor differences the results were similar, so only the Mundlak
results are reported. Table 4.1 presents the APEs for the five models considered,
whereas table 4.2 presents the regression results for overall innovation.


Average partial effects 


A number of points are worth noting on the APE results. These results have the
advantage of being comparable across models, an advantage which is often not
preserved with the regression coefficients (see Wooldridge, 2010, chapter 15 for more
details). It is interesting to note that, with a few exceptions, the results do not differ
much across the five models. In particular, the signs and the significance levels are all the
same. The biggest difference in magnitude is that between the APE of lagged
innovation in model 4 (0.380) and that in model 5 (0.113). This difference can be attributed to
two factors. The first is the initial conditions, which are directly incorporated in
model 5 but not in model 4. This is reflected in the results presented in table 4.2,
where the coefficient for the initial conditions is highly significant and is even larger than
that of lagged innovation. The second is the different treatment of the
unobserved heterogeneity: model 5 incorporates the firm-specific effects directly in
the model, whereas model 4 only adjusts the standard errors.


5 See Appendix B for a brief theoretical exposition of the APEs and Wooldridge (2010) for more details.


4.1 Average Partial Effects (APEs)


                                 Model 1         Model 2         Model 3         Model 4         Model 5
Variables*                       Pooled          Standard RE     Mundlak         Dynamic         Dynamic RE
Innovation (t-1)                                                                 0.380 (0.015)   0.113 (0.025)
Collaboration                    0.138 (0.020)   0.119 (0.018)   0.165 (0.028)   0.104 (0.016)   0.126 (0.025)
Flexible working arrangements
  Flexible work hours            0.085 (0.017)   0.079 (0.016)   0.091 (0.027)   0.068 (0.014)   0.071 (0.023)
  Flexible leave                 0.066 (0.017)   0.048 (0.016)   0.096 (0.025)   0.041 (0.014)   0.071 (0.024)
  Job sharing                    0.064 (0.022)   0.045 (0.020)   0.080 (0.034)   0.053 (0.018)   0.066 (0.029)
  Working from home              0.030 (0.019)   0.029 (0.017)   0.033 (0.026)   0.020 (0.014)   0.032 (0.022)
ICT intensity**
  Low                           -0.204 (0.040)  -0.170 (0.036)  -0.250 (0.074)  -0.116 (0.033)  -0.156 (0.064)
  Moderate                      -0.191 (0.020)  -0.157 (0.019)  -0.229 (0.025)  -0.110 (0.016)  -0.144 (0.022)
  High                          -0.126 (0.025)  -0.095 (0.023)  -0.177 (0.038)  -0.070 (0.020)  -0.118 (0.033)

* = Overall innovation is the dependent variable.
** = Relative to the most intense ICT category.
Standard errors are in brackets (computed using bootstrapping with 200 replications).


Regression results 


Next consider the results presented in table 4.2. One important consideration is the
proportion of the combined variance that is attributed to the panel-level variance
component. With the current data, the unobserved effects for both models 2 and 3
are significantly different from zero (on the basis of the likelihood-ratio test results)
and account for more than 56% of the variance of the composite error, whereas for
model 5 this proportion is around 37%. These results indicate that controlling for
unobserved effects is important in the current analysis.
4.2 Regression results for the five models for (overall) innovation⁶


                               Model 1       Model 2       Model 3       Model 4       Model 5
Variables                      Pooled        Standard RE   Mundlak       Dynamic       Dynamic RE
Innovation (t-1)                                                         1.076 ***     0.423 ***
Innovation (t=0)                                                                       0.746 ***
Industry (Manufacturing)
  Mining                      -0.058        -0.095        -0.066        -0.034        -0.028
  Construction                -0.316 ***     0.529 ***    -0.487 ***    -0.199 **     -0.272 *
  Wholesale                   -0.121        -0.168        -0.204        -0.062        -0.103
  Retail                      -0.042        -0.066        -0.103         0.008        -0.022
  Accommodation               -0.243 **     -0.406 **     -0.403 **     -0.155 *      -0.209
  Transport                   -0.278 ***    -0.459 ***    -0.417 ***    -0.215 ***    -0.311 **
  Telecommunications          -0.052        -0.082        -0.117        -0.053        -0.103
  Real Estate                 -0.175        -0.251 *      -0.347 **     -0.137        -0.221 *
  Professional                -0.122        -0.172        -0.212        -0.091        -0.136
  Administrative              -0.312 ***     0.484 **     -0.485 ***    -0.218 **     -0.354 **
  Recreation                  -0.243 **     -0.380 **     -0.372 **     -0.130        -0.180
  Other Services               0.026        -0.035         0.038         0.048         0.074
Size (Very small)
  Small                        0.052         0.135         0.036         0.040         0.052
  Average                      0.140 **      0.323 ***     0.124         0.104 **      0.092
Flexible work hours            0.239 ***     0.326 ***     0.259 ***     0.218 ***     0.249 ***
Flexible leave                 0.186 ***     0.201 ***     0.025         0.131 ***     0.015
Job sharing                    0.184 ***     0.189 **      0.099         0.174 ***     0.108
Working from home              0.085         0.119 *       0.032         0.066         0.025
Competition (No competition)
  Minimal                      0.142         0.183        -0.015         0.121         0.016
  Moderate or Strong           0.270 ***     0.276 **     -0.073         0.188 **     -0.040
ICT intensity (Most intense)
  Low                         -0.556 ***    -0.689 ***    -0.170        -0.364 ***    -0.128
  Moderate                    -0.522 ***    -0.638 ***    -0.043        -0.344 ***    -0.038
  High                        -0.344 ***    -0.390 ***    -0.140        -0.221 ***    -0.094
Market location (Only local)
  Only overseas               -0.456 **     -0.779 **     -0.823 **     -0.420 **     -0.594 **
  Both local and overseas      0.232 ***     0.328 ***     0.240 **      0.176 ***     0.172 **
Financial year (2007-2008)
  2008-2009                   -0.232 ***    -0.340 ***    -0.313 ***    -0.359 ***    -0.333 ***
  2009-2010                   -0.167 ***    -0.241 ***    -0.216 ***    -0.193 ***    -0.205 ***
Collaboration                  0.394 ***     0.502 ***     0.339 ***     0.337 ***     0.324 ***
Intercept                     -0.063        -0.019        -0.164        -0.554 ***    -0.672 ***
Group Means
  Flexible hours                                           0.132                       0.027
  Flexible leave                                           0.389 ***                   0.262 **
  Job sharing                                              0.254                       0.153
  Working from home                                        0.111                       0.103
  Competition (No competition)
    Minimal                                                0.217                       0.109
    Moderate or Strong                                     0.598 **                    0.359 *
  ICT intensity (Most intense)
    Low                                                   -0.880 **                   -0.461
    Moderate                                              -0.915 ***                  -0.506 ***
    High                                                  -0.603 ***                  -0.352 **
  Collaboration                                            0.391 **                    0.172
Log Likelihood                -3,370.4      -3,078.8      -3,029.8      -2,971.4      -2,898.3
AIC                            6,798.9       6,217.6       6,139.6       6,002.9       5,880.7
BIC                            6,990.5       6,415.8       6,404.0       6,201.1       6,158.2
Sigma                                        1.135         1.154                       0.768
rho                                          0.563 ***     0.571 ***                   0.371 ***
Observations (n)               5,481         5,481         5,481         5,481         5,481


*** = significant at the 0.01 level; ** = significant at the 0.05 level; * = significant at the 0.10 level.
Reference category included in brackets.


In addition, the results for models 3 and 5 indicate that the allowance for correlation
between the firm-specific effects and the regressors is important for the current
data. In particular, the group means for flexible leave, all ICT intensities, and one
category for competition are all significant. Model 2 is a special case of model 3,
namely the case where the heterogeneity is uncorrelated with the regressors.
Wald and likelihood-ratio tests were therefore conducted on the null
hypothesis that all the coefficients for the group means are equal to
zero, in which case the Mundlak model reduces to the Standard RE model. Both tests
strongly rejected the null hypothesis at the 5% significance level. This indicates that
some allowance for the correlation between the firm heterogeneity and the regressors is
favoured by the current data.
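The likelihood-ratio comparison of models 2 and 3 can be roughly checked from the log-likelihoods reported in table 4.2. The sketch below is not the study's own computation: the test statistics are not reported in the paper, the ten degrees of freedom are obtained here by counting the group-mean coefficients in the table, and the helper `chi2_sf_even_df` is a hypothetical name for a closed-form chi-square tail probability that is valid only for an even number of degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total


loglik_re = -3078.8       # model 2 (Standard RE), table 4.2
loglik_mundlak = -3029.8  # model 3 (Mundlak), table 4.2
df = 10                   # assumed: ten group-mean coefficients in model 3

lr_stat = 2.0 * (loglik_mundlak - loglik_re)
p_value = chi2_sf_even_df(lr_stat, df)
```

Under these assumptions the statistic is about 98, far above the 5% critical value for ten degrees of freedom (about 18.3), which is in line with the rejection reported above.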


For the dynamic models (models 4 and 5), the first-order lag terms are significant and
positive. For the Wooldridge model, the initial conditions (innovation at t=0), as well
as some of the group means, are also positive and significant. These results indicate
that the persistence of innovation hypothesis is supported, i.e. a firm that innovated
in the previous period is more likely to innovate, and that it is important to
control for the firm-specific effects. To confirm the results, tests were conducted on
the joint significance of the extra variables added in model 5 (i.e. comparing model 5
to model 4 and model 3, respectively). In all cases, the tests rejected the null
hypotheses, giving support to model 5.


Finally, based on the AIC and BIC results, model 5 is most favoured, followed by 
model 4, model 3, model 2, and finally model 1. (See Appendix C for the definition of 
these criteria.) 


6 The author also examined the potential for endogeneity between organisational innovation and flexible
working arrangements; however, the investigations on the sample did not support such a claim. The models
were also run excluding organisational innovation from overall innovation; the results were similar.


4.2 Overall innovation results


The overall innovation results for the five models are presented in table 4.2. Across all 
models, the three key variables of this study (collaboration, flexible working 
arrangements, and ICT) play an important role in explaining the innovation behaviour. 
Collaboration is significant at the highest level and positive. This supports the 
findings of previous ABS-related studies that used cross-sectional data (ABS, 2008; 
Rotaru et al., 2013; and DITR, 2006). 


For the flexible working arrangements indicators, providing flexible work hours and 
flexible leave are both significant and positive.⁷ Job sharing is also positive and is
significant for some of the models. Working from home, although positive, is not 
significant for any of the models. These findings are in line with those of Rotaru et al., 
2013 (the only other ABS study that included flexible working arrangements in the 
innovation model), with the exception of the working from home variable which was 
found to be significant in the former study. 


The ICT intensity categories are also significant and they indicate that, all other things
being held constant, moving up to a more intense level of ICT improves the likelihood of
innovation. Todhunter and Abello (2011), Rotaru et al. (2013), and Tiy et al. (2013)
also found these results.


The results for the other control variables indicate the following. First, market 
location plays an important role. Compared to a firm that operates only locally, 
expanding the business operation of local firms to overseas markets positively and 
significantly affects the likelihood of the firm to innovate. Competition has a positive 
effect on innovation, but the results are only significant for the Moderate or Strong 
category. Size is only significant for some of the models. For the dynamic models, the 
coefficients for lagged innovation and initial innovation are both positive and highly 
significant. 


To complement the results and to get a better indication of the effects of the
main variables on innovation, consider also the APEs included in table 4.1. The results
show that, after averaging across all firms and all time periods, having innovated in the
previous period is associated with an increase of more than eleven percentage points in the
propensity to innovate (model 5). This indicates that even after controlling for unobserved
heterogeneity and the other covariates, state dependence plays an important role in
the likelihood of a firm innovating.


7 Note: for models 3 and 5, significance refers to either the significance of the individual coefficients, those of the
group means, or both.


The results also show that, with the exception of the Working from Home indicator, all
the key variables across all models are highly significant. The coefficients for
collaboration, as well as three of the four flexible working arrangements, are all
positive. The results also support the importance of ICT intensity. Relative to a firm
with the most intense ICT, having any lower ICT intensity significantly decreases the likelihood
of innovating. In the case of model 5 this is by at least eleven percentage points.


4.3 Different types of innovation results 


In this subsection, the study expands the previous results of the dynamic model by 
considering each type of innovation separately. The results are presented in table 4.3. 


As expected, the magnitude and sign of the coefficients differ across the different 
types of innovation. The significance of the coefficients also differs, but there are 
regressors which remain significant across most models. Collaboration is such an 
example, where for all types of innovation the effect is positive and significant, mostly 
at the highest level. 


Amongst the flexible working arrangements indicators there is more variation with
regard to their significance. The effect of flexible work hours is positive and
significant for new goods and services, new operational processes, and new marketing
methods. For flexible leave the effect is positive and significant for new operational
and organisational processes. Job sharing also plays an important role in influencing
new organisational and operational processes, as well as new marketing methods.


Similar to the previous results, overall, having a more intense ICT is associated with an 
increase in the likelihood to innovate. Likewise, operating both locally and overseas 
has a positive and mostly significant effect on the propensity to innovate. 


In all cases, initial innovation and lagged innovation are highly significant and positive.
These findings support the hypothesis that innovation is persistent and that
initial innovation plays an important role in the analysis, which is in line with what
one would expect given the short time-frame.
4.3 Results for the different types of innovation (Model 5)


Variables                      Goods & Services  Organisational    Operational       Marketing
Innovation (t-1)               0.371 ***         0.482 ***         0.541 ***         0.447 ***
Innovation (t=0)               0.961 ***         0.637 ***         0.672 ***         0.528 ***
Industry (Manufacturing)
  Mining                      -0.306 *           0.060             0.030            -0.260
  Construction                -0.238 *          -0.048            -0.402 ***        -0.260 **
  Wholesale                   -0.008            -0.045            -0.216 **         -0.067
  Retail                      -0.004            -0.055            -0.420 ***        -0.004
  Accommodation               -0.188             0.060            -0.321 ***         0.061
  Transport                   -0.273 **         -0.098            -0.294 **         -0.249 *
  Telecommunications           0.032            -0.030            -0.262 *          -0.020
  Real Estate                 -0.457 ***         0.035            -0.476 ***         0.178
  Professional                -0.252 *          -0.128            -0.310 ***        -0.191
  Administrative              -0.376            -0.163            -0.366 **         -0.221
  Recreation                  -0.295 *          -0.180            -0.512 ***        -0.114
  Other Services              -0.062             0.082            -0.198             0.205
Size (Very small)
  Small                       -0.083             0.103             0.148             0.041
  Average                     -0.182 **          0.147 **          0.139 *          -0.012
Flexible work hours            0.164 **          0.110             0.252 ***         0.142 *
Flexible leave                 0.036             0.092             0.036             0.068
Job sharing                    0.130             0.245 ***         0.169 *           0.246 ***
Working from home              0.134             0.068            -0.006             0.001
Competition (No competition)
  Minimal                     -0.121             0.047             0.034             0.185
  Moderate or Strong          -0.007             0.064             0.021             0.163
ICT intensity (Most intense)
  Low                          0.058            -0.247            -0.054             0.155
  Moderate                     0.030             0.001            -0.050            -0.181
  High                        -0.123            -0.038            -0.133            -0.177 *
Market location (Only local)
  Only overseas               -0.597 **         -0.450 *          -0.458 *          -0.240
  Both local and overseas      0.304 ***         0.036             0.124 *           0.157 **
Financial year (2007-2008)
  2008-2009                    0.227 ***        -0.130 ***        -0.161 ***        -0.085
  2009-2010                   -0.191 ***        -0.077            -0.185 ***         0.007
Collaboration                  0.283 ***         0.294 ***         0.279 ***         0.168 **
Intercept                     -1.202 ***        -1.348 ***         1.431 ***        -1.442 ***
Group Means
  Flexible hours               0.009             0.131            -0.059             0.088
  Flexible leave               0.046             0.276 ***         0.330 ***        -0.001
  Job sharing                  0.029            -0.094             0.103             0.002
  Working from home           -0.014             0.101             0.066             0.099
  Competition (No competition)
    Minimal                    0.329             0.222             0.244            -0.119
    Moderate or Strong         0.351             0.011             0.277             0.274
  ICT intensity (Most intense)
    Low                       -0.332            -0.284             0.353            -0.985 ***
    Moderate                  -0.511 ***        -0.352 ***         0.288 **         -0.529 ***
    High                      -0.187             0.130             0.269 *           0.083
  Collaboration                0.225 *           0.048             0.311 ***         0.333 ***
Log Likelihood                -2,640.8          -2,707.6          -2,594.4          -2,562.9
AIC                            5,365.5           5,499.1           5,272.8           5,209.8
BIC                            5,643.1           5,776.7           5,550.4           5,487.4
Sigma                          0.868             0.616             0.688             0.688
rho                            0.430 ***         0.275 ***         0.321 ***         0.321 ***
Observations (n)               5,481             5,481             5,481             5,481


*** = significant at the 0.01 level; ** = significant at the 0.05 level; * = significant at the 0.10 level.
Reference category is in brackets.


5. CONCLUDING REMARKS


This is the first longitudinal analysis study conducted on the Main Unit Record File of 
the ABS Business Longitudinal Database. The investigation explores the 
implementation, estimation, and performance of five popular discrete choice 
longitudinal data models. In doing so the analysis extends some of the previous BLD- 
based cross-sectional analyses conducted by the ABS to the longitudinal analysis front. 


Given that much of the micro data collected by the ABS is categorical in nature and
considering the current trend towards the collection and “creation” of richer datasets,
desirably with both cross-sectional and time dimensions, the methodology developed
in this study plays an important role and has the potential to be used in other ABS
outputs.


The empirical stage is set on the analysis of the effects of four key factors on 
innovation, namely collaboration, flexible working arrangements, information 
technology, and innovation’s own history. 


From an empirical perspective, the study found that all four factors play important
roles in influencing innovation. Across all models, collaboration was highly significant
and positive. Flexible working arrangements were also positive and significant (with
the exception of working from home). However, the results differed depending on
the type of innovation considered. The results for ICT indicate that, all other things
being held constant, the more technologically intense a firm is, the higher its propensity
to innovate. The coefficient for lagged innovation was highly significant and positive.


Another finding, which showed up across most of the models, indicates that by 
operating locally as well as globally a firm increases its probability of innovation 
relative to the firms that operate only locally or only overseas. Overall, the results 
support the previous findings of related ABS studies that have examined the same 
relationships using cross-sectional data. Innovation was also found to be persistent. 


On the technicalities of the models, the following can be summarised: (1) the lag
effects and the initial conditions played key roles in the analysis; (2) controlling for the
unobserved firm-specific effects was important in the analysis; and (3) allowing for
some correlation between the firm-specific effects and the regressors was favourable.


From a methodological perspective, a few things are worth noting. The richness of
longitudinal data provides great opportunities for analysis and opens the door to many
analyses that cannot be done on pure cross-sectional or time-series data. However,
these benefits come at a cost, as the analyst is faced with one major task: extracting
meaningful and valid inferences from these highly dependent data.

With non-linear models this undertaking is usually challenging. The non-additive
heterogeneity and the serial dependence of the errors make the empirical
analysis difficult and laborious. In these instances, it is often the case that the analyst
cannot rely on the default standard errors computed by software packages, as they are
not robust to serial correlation or heterogeneity. Rather, one needs to implement
alternative methods, such as allowing for clusters in the estimation of the standard
errors or simulating them via bootstrapping for panel data. This, however, requires a
lot of computational power, as evidenced by some of the models in this study where the
estimation time was in excess of five hours.


Singer and Willett (2003, Preamble, p. vii) summarise these points well:


These methods are complex, their statistical models sophisticated, their assumptions 
subtle. The default options in most computer packages do not automatically generate the 
statistical models you need. Thoughtful data analysis requires diligence. But make no 
mistake; hard work has a payoff. 
APPENDIXES


A. DATA COMPILATION


The study utilises firm-level data for Australian small- and medium-sized businesses
covered by the 2007-2008, 2008-2009, and 2009-2010 waves of the BLD.⁸ With a few
exceptions, the compilation of the variables followed that described in Rotaru et al.
(2013).


This section describes the compilation of the variables, beginning with the three main 
variables and then including the other control variables used in the model. 


Innovation 


The different types of innovation were based on the categories included in the Oslo 
Manual (OECD/Eurostat, 2005). Four types of innovation were identified: 


• New goods or services - These include any good or service or combination of
these which is new to the business (or significantly improved). Its characteristics
or intended uses differ significantly from those previously produced/offered.


• New operational processes - These include any new or significantly improved
methods of producing or delivering goods or services of a business (including
significant change in techniques, equipment and/or software).


• New organisational / managerial processes - This includes new or significantly
improved strategies, structures or routines of a business which aim to improve
performance.


• New marketing methods - This includes new or significantly improved designs,
packaging or selling methods aimed to increase the appeal of goods or services
or to enter new markets.


In addition to the four types of innovation, an overall measure of innovation was 
constructed. The overall measure indicates whether a business engaged in any of the 
four types of innovation activity. 
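The construction of the overall measure can be sketched as a simple "any of the four" rule. The key names in `INNOVATION_TYPES` and the function name are hypothetical labels chosen for this illustration; they are not the variable names used in the BLD.

```python
# Hypothetical labels for the four innovation types described above.
INNOVATION_TYPES = ("goods_services", "operational", "organisational", "marketing")


def overall_innovation(flags):
    """Return 1 if the business reported any of the four innovation types, else 0."""
    missing = [name for name in INNOVATION_TYPES if name not in flags]
    if missing:
        raise KeyError(f"missing innovation flags: {missing}")
    return int(any(flags[name] for name in INNOVATION_TYPES))


no_innovation = dict.fromkeys(INNOVATION_TYPES, 0)
marketing_only = dict(no_innovation, marketing=1)
```

Here `overall_innovation(marketing_only)` flags the business as an innovator, while `overall_innovation(no_innovation)` does not.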


8 Note that the information from all three waves was used in the analysis. For the dynamic models, the
innovation indicator for 2006-2007, which was available from the BCS, was retrieved and used.


Flexible working arrangements


In the context of this paper, ‘flexible working arrangements’ refer to the working 
arrangements offered by businesses to their employees. The different arrangements 
were grouped into four indicators: 


• Flexible working hours - which includes the availability of the employees to deal
with non-work issues and the selection of own shifts and rosters;


• Flexible leave - which includes paid parental leave, flexible use of leave (personal
sick, unpaid or compassionate leave), ability to buy extra annual leave, cash out
annual leave or take leave without pay;


• Job sharing - which refers to the availability of job sharing; and


• Working from home - which refers to the availability of working from home.


ICT intensity: 


For information technology, an indicator was constructed following Rotaru et al. 
(2013) and Todhunter and Abello (2011). Some slight changes were made to the 
groupings: 


• Most intense - the business has broadband connection, web presence, and
places or receives orders via the internet;


• High - the business has broadband connection, web presence, but does not
place or receive orders via the internet;


• Moderate - the business has broadband connection but no web presence; and


• Low - the business does not use broadband connection.
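The four groupings above amount to a small decision rule, sketched below. The function and argument names are hypothetical; the rule itself follows the definitions in the list, with the three inputs taken to be yes/no indicators.

```python
def ict_intensity(broadband, web_presence, internet_orders):
    """Classify a business into one of the four ICT intensity groups."""
    if not broadband:
        return "Low"       # does not use a broadband connection
    if not web_presence:
        return "Moderate"  # broadband but no web presence
    # Broadband and web presence: split on placing or receiving orders online.
    return "Most intense" if internet_orders else "High"


example = ict_intensity(broadband=True, web_presence=True, internet_orders=False)
```

In this example the business has broadband and a web presence but takes no orders via the internet, so it falls in the "High" group.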


Collaboration: 


The collaboration indicator flags the presence of any of the following collaborative
arrangements: joint research and development, joint buying, joint production of
goods and services, integrated supply chain, joint marketing or distribution, and other
collaborative arrangements specified by the business.


Other variables:


The selected business characteristics employed in the models are described below. 


Industry Division:

• Mining
• Manufacturing
• Construction
• Wholesale
• Retail trade
• Accommodation and food service
• Transport, postal and warehousing
• Information, media and telecommunications
• Rental, hiring and real estate services
• Professional, scientific and technical services
• Administrative and support services
• Arts and recreation services
• Other services


Number of employees:

• Very Small: 0-4 employees⁹
• Small: 5-19 employees
• Average: 20-199 employees


Degree of competition:

• No competition: 0 competitors
• Minimal: 1-2 competitors
• Moderate or Strong: 3 or more competitors
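The size and competition groupings can be expressed as two small binning functions. This is a sketch using the cut-offs listed above; the function names are hypothetical, and the treatment of values outside the listed ranges (raising an error) is an assumption made for the illustration.

```python
def size_category(employees):
    """Map a head count to the size groups used in the models."""
    if employees < 0:
        raise ValueError("employee count cannot be negative")
    if employees <= 4:
        return "Very Small"
    if employees <= 19:
        return "Small"
    if employees <= 199:
        return "Average"
    raise ValueError("outside the 0-199 employee range covered by the categories")


def competition_category(competitors):
    """Map a number of competitors to the competition groups used in the models."""
    if competitors < 0:
        raise ValueError("number of competitors cannot be negative")
    if competitors == 0:
        return "No competition"
    if competitors <= 2:
        return "Minimal"
    return "Moderate or Strong"


example = (size_category(12), competition_category(3))
```

In this example a firm with 12 employees and 3 competitors is classed as "Small" and as facing "Moderate or Strong" competition.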
Financial year:

• 2007-2008
• 2008-2009
• 2009-2010


Market location:

• Only local
• Both local and overseas
• Only overseas


9 Note that the number of businesses with zero employees is very small, around 1%.


B. AVERAGE PARTIAL EFFECTS (APES)


This section briefly describes the derivation of the average partial effects (APEs) in the
context of the Wooldridge Dynamic Model (model 5 in Section 3).¹⁰ The
methodology can be easily modified to derive the APEs for the other models.


10 For more details see Wooldridge (2010) and Wooldridge (2005).


The typical interest is in estimating

$$\mu_\theta(x_t^*, y_{t-1}^*, c) = E(y_{it} \mid x_{it} = x_t^*, y_{i,t-1} = y_{t-1}^*, c_i = c) \qquad (B.1)$$

where $\theta$ is a vector of parameters, $y_{it}$, in the context of this study, is a binary
variable, and $x_t^*, y_{t-1}^*, c$ are values of interest to the researcher.

As it is usually unclear which value(s) should be selected for $c$, rather than selecting
$c$, one can proceed with the estimation of (B.1) by averaging across the distribution
of $c_i$. This is indicated below:

$$\mu_\theta(x_t^*, y_{t-1}^*) = E_{c_i}\left[\mu_\theta(x_t^*, y_{t-1}^*, c_i)\right] \qquad (B.2)$$

where the expectation is with respect to the unobserved effects.

From the model specified in (3.3),

$$P(y_{it} = 1 \mid x_{it}, y_{i,t-1}, c_i) = F(\rho\, y_{i,t-1} + x_{it}'\beta + c_i)$$

and from

$$c_i \mid z_i, y_{i0} = \psi + z_i\gamma + \xi_0 y_{i0} + a_i$$

it follows that

$$\mu_\theta(x_t^*, y_{t-1}^*) = E_{c_i}\left[\Phi(\psi + z_i\gamma + \xi_0 y_{i0} + \rho\, y_{t-1}^* + x_t^{*\prime}\beta + a_i)\right] \qquad (B.3)$$

where $\Phi$ is the standard normal distribution function, which is the form $F$ takes in the
probit case. Recall that $a_i \sim N(0, \sigma_a^2)$.

As only the conditional distribution of $c_i$ is assumed in the model, leaving the
unconditional distribution unspecified, one cannot directly estimate (B.3). Instead
the law of iterated expectations is used and expression (B.3) becomes

$$\mu_\theta(x_t^*, y_{t-1}^*) = E_{(z_i, y_{i0})}\left\{E\left[\Phi(\psi + z_i\gamma + \xi_0 y_{i0} + \rho\, y_{t-1}^* + x_t^{*\prime}\beta + a_i) \mid z_i, y_{i0}\right]\right\} \qquad (B.4)$$

Using the fact that the dependent variable is dichotomous, which implies that the
expectation is simply the propensity, after some simple mathematics it follows that
the inner expectation $E[\,\cdot \mid z_i, y_{i0}]$ is equal to

$$\Phi(\psi_a + z_i\gamma_a + \xi_{a0} y_{i0} + \rho_a y_{t-1}^* + x_t^{*\prime}\beta_a) \qquad (B.5)$$

where

$$(\psi_a, \gamma_a, \xi_{a0}, \rho_a, \beta_a) = (1 + \sigma_a^2)^{-0.5}\,(\psi, \gamma, \xi_0, \rho, \beta)$$

Substituting this expression into (B.4), all that is left is to find a good estimator for the
outside expected value, i.e. $E_{(z_i, y_{i0})}\{\cdot\}$.

The following sample counterpart estimates this consistently:

$$\hat{\mu}_\theta(x_t^*, y_{t-1}^*) = \frac{1}{N}\sum_{i=1}^{N} \Phi(\hat{\psi}_a + z_i\hat{\gamma}_a + \hat{\xi}_{a0} y_{i0} + \hat{\rho}_a y_{t-1}^* + x_t^{*\prime}\hat{\beta}_a) \qquad (B.6)$$

where the coefficients are the maximum likelihood estimates.

Using (B.6) one can easily estimate the desired APEs. For instance, the APE for a
binary variable, $w$, is given by:

$$APE_w = \hat{\mu}_\theta(x_t^*, y_{t-1}^* \mid w = 1) - \hat{\mu}_\theta(x_t^*, y_{t-1}^* \mid w = 0)$$

In summary, the average partial effect of a discrete variable can be thought of as
measuring the discrete change in probability averaged over the distribution of the
unobserved variable(s). Usually this is done by conditioning on a set of values that are
of interest to the researcher.
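The sample counterpart (B.6) and the resulting APE for a binary variable can be sketched in a few lines. This is a minimal illustration, not the code used in the study: it assumes a single covariate, and the coefficient values, the firm records and all function and key names (`mu_hat`, `ape_binary`, `collab`, and so on) are hypothetical.

```python
import math


def norm_cdf(v):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def mu_hat(firms, coef, sigma_a, x_star, y_lag_star):
    """Sample analogue of (B.6) at the chosen values x_star and y_lag_star."""
    scale = (1.0 + sigma_a ** 2) ** -0.5  # rescales every coefficient, as in (B.5)
    fixed = coef["psi"] + coef["rho"] * y_lag_star
    fixed += sum(coef["beta"][name] * value for name, value in x_star.items())
    total = 0.0
    for z_i, y_init in firms:  # average over (z_i, y_i0) in the sample
        index = fixed + coef["xi0"] * y_init
        index += sum(coef["gamma"][name] * value for name, value in z_i.items())
        total += norm_cdf(scale * index)
    return total / len(firms)


def ape_binary(firms, coef, sigma_a, x_star, y_lag_star, name):
    """Average probability with the binary covariate `name` at 1 minus that at 0."""
    on = mu_hat(firms, coef, sigma_a, dict(x_star, **{name: 1}), y_lag_star)
    off = mu_hat(firms, coef, sigma_a, dict(x_star, **{name: 0}), y_lag_star)
    return on - off


# Hypothetical inputs, for illustration only (not the study's data or estimates).
coef = {"psi": -0.7, "rho": 0.4, "xi0": 0.7,
        "beta": {"collab": 0.3}, "gamma": {"collab": 0.2}}
firms = [({"collab": 0.0}, 0), ({"collab": 0.5}, 1), ({"collab": 1.0}, 1)]
ape_collab = ape_binary(firms, coef, 0.8, {"collab": 0}, 1, "collab")
ape_lag = (mu_hat(firms, coef, 0.8, {"collab": 0}, 1)
           - mu_hat(firms, coef, 0.8, {"collab": 0}, 0))
```

The APE of lagged innovation is obtained in the same way, by evaluating (B.6) at $y_{t-1}^* = 1$ and $y_{t-1}^* = 0$ and taking the difference, as in `ape_lag`.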


To compute standard errors for the APEs, one can use the delta method or
bootstrapping. In this study, panel-robust standard errors were calculated via
bootstrapping. Note, however, that due to the nonlinearity and complexity of
the dynamic random effects model, the bootstrapping simulation had to be programmed
manually.
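The general idea of a bootstrap for panel data is to resample whole firms, so that each firm's history stays intact. The sketch below illustrates this under stated assumptions: the statistic being bootstrapped is a simple innovation rate standing in for the APE (which in the study would require re-estimating the model on every replicate), and the panel, the function names and the seed are hypothetical. Only the 200 replications are taken from the note to table 4.1.

```python
import random
import statistics


def panel_bootstrap_se(panel, estimator, replications=200, seed=0):
    """Bootstrap standard error that resamples whole firms with replacement."""
    rng = random.Random(seed)
    firm_ids = list(panel)
    estimates = []
    for _ in range(replications):
        drawn = [rng.choice(firm_ids) for _ in firm_ids]
        # Keep each drawn firm's full history so within-firm dependence is preserved.
        sample = [panel[firm] for firm in drawn]
        estimates.append(estimator(sample))
    return statistics.stdev(estimates)


def innovation_rate(sample):
    """Stand-in statistic: share of firm-period observations with innovation."""
    outcomes = [y for history in sample for y in history]
    return sum(outcomes) / len(outcomes)


# Hypothetical panel of five firms observed over three periods.
panel = {"a": [1, 1, 1], "b": [0, 0, 1], "c": [0, 0, 0], "d": [1, 0, 1], "e": [1, 1, 0]}
se = panel_bootstrap_se(panel, innovation_rate)
```

Replacing `innovation_rate` with a function that refits the model and returns an APE gives the manual procedure described above, at a correspondingly higher computational cost.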
C. MODEL MEASURES


In computing the Akaike information criterion (AIC) and the Bayesian information
criterion (BIC) the following usual formulae were used:

$$AIC = 2k - 2\ln(L) \qquad (C.1)$$

$$BIC = k\ln(n) - 2\ln(L) \qquad (C.2)$$

where $k$ stands for the number of parameters in the statistical model, $n$ is the
sample size, and $L$ is the maximised value of the model likelihood.
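As a worked check, (C.1) and (C.2) can be applied to the log-likelihoods reported in table 4.2. The parameter counts `k` below are not reported in the paper; they are an assumption obtained here by counting the coefficients listed for each model, plus one variance parameter for each random effects model.

```python
import math


def aic(loglik, k):
    """Akaike information criterion, as in (C.1)."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion, as in (C.2)."""
    return k * math.log(n) - 2 * loglik


n = 5481  # observations, table 4.2
# (log-likelihood from table 4.2, assumed parameter count k)
models = {
    "Model 1": (-3370.4, 29),
    "Model 2": (-3078.8, 30),
    "Model 3": (-3029.8, 40),
    "Model 4": (-2971.4, 30),
    "Model 5": (-2898.3, 42),
}
criteria = {name: (aic(ll, k), bic(ll, k, n)) for name, (ll, k) in models.items()}
ranking = sorted(criteria, key=lambda name: criteria[name][0])  # lowest AIC first
```

With these counts the computed values agree with the AIC and BIC figures in table 4.2 to within rounding of the reported log-likelihoods, and the ranking places model 5 first and model 1 last.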
For the estimation of rho, which measures the relative importance of the unobserved
effects, the following formula was used:

$$\text{rho} = \frac{\sigma_c^2}{\sigma_c^2 + \sigma_\varepsilon^2} \qquad (C.3)$$

In this case, $\sigma_c^2$ is the variance of the unobserved effects, $\sigma_\varepsilon^2$ is the variance of the
idiosyncratic component, and $\sigma_c^2 + \sigma_\varepsilon^2$ is the variance of the composite error.
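Formula (C.3) can be checked against the Sigma and rho rows of table 4.2. The sketch assumes that Sigma in the table is the standard deviation of the unobserved effects and that the idiosyncratic variance is normalised to one, as is usual for a probit; neither assumption is stated explicitly in the table.

```python
def rho_from_sigma(sigma_c, sigma_eps=1.0):
    """Share of the composite error variance due to the unobserved effects, as in (C.3)."""
    var_c = sigma_c ** 2
    return var_c / (var_c + sigma_eps ** 2)


# (Sigma, rho) pairs as reported in table 4.2
reported = {"Model 2": (1.135, 0.563), "Model 3": (1.154, 0.571), "Model 5": (0.768, 0.371)}
recomputed = {name: round(rho_from_sigma(sigma), 3) for name, (sigma, _) in reported.items()}
```

Under these assumptions the recomputed values match the reported rho for all three random effects models.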